Probabilistic base calling of Solexa sequencing data.


Abstract

BACKGROUND: Solexa/Illumina short-read ultra-high-throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data pose new challenges; currently a fair proportion of the tags are routinely discarded because they cannot be matched to a reference sequence, which reduces the effective throughput of the technology.

RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads.

CONCLUSION: We show that the method improves genome coverage and the number of usable tags by an average of 15% compared with Solexa's data processing pipeline. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
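The abstract does not give the details of the algorithm, so the following Python sketch is only an illustration of the two ideas it names, not the authors' method or the API of their R package. It assumes that a probability for each of A, C, G and T is already available at every read position (the model-based clustering step that would produce these is not shown). The function names, the 0.2 probability cutoff and the 1-bit penalty are all hypothetical choices made for this example.

```python
import math

# Standard IUPAC nucleotide codes, keyed by the set of bases each one stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def iupac_call(probs, cutoff=0.2):
    """Return the IUPAC symbol covering every base with probability >= cutoff.

    `probs` maps each of "A", "C", "G", "T" to a probability.
    The cutoff value is an assumption made for this sketch.
    """
    kept = frozenset(base for base, p in probs.items() if p >= cutoff)
    if not kept:  # nothing reaches the cutoff: treat the position as unknown
        kept = frozenset("ACGT")
    return IUPAC[kept]


def information_bits(probs):
    """Information content of one position: 2 bits minus the Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return 2.0 - entropy


def best_subtag(read_probs, min_len=1, penalty=1.0):
    """Return (start, end) of the contiguous window with the highest score.

    Each position scores its information content minus `penalty` bits, so
    uncertain positions count against a window. This scoring rule is a
    hypothetical stand-in for the score used in the paper.
    """
    scores = [information_bits(p) - penalty for p in read_probs]
    best, best_score = (0, 0), float("-inf")
    for start in range(len(scores)):
        total = 0.0
        for end in range(start + 1, len(scores) + 1):
            total += scores[end - 1]
            if end - start >= min_len and total > best_score:
                best, best_score = (start, end), total
    return best
```

Under these assumptions, a position with probabilities 0.6 for A and 0.4 for G would be coded as R, and a read whose final position is completely uncertain would be trimmed to exclude that position.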


Bibliographic record

Authors
Rougemont, J.; Amzallag, A.; Iseli, C.; Farinelli, L.; Xenarios, I.; Naef, F.
Year: 2008
Original format: PDF
Language: English

Similar literature


1. Probabilistic base calling of Solexa sequencing data [J]. Jacques Rougemont, Arnaud Amzallag, Christian Iseli, BMC Bioinformatics. 2008, No. 1

2. Model-based quality assessment and base-calling for second-generation sequencing data [J]. Bravo HC, Irizarry RA. Biometrics: Journal of the Biometric Society: An International Society Devoted to the Mathematical and Statistical Aspects of Biology. 2010, No. 3

3. BM-BC: a Bayesian method of base calling for Solexa sequence data [J]. Yuan Ji, Riten Mitra, Fernando Quintana, BMC Bioinformatics. 2012, Supplement 13

4. naiveBayesCall: An Efficient Model-Based Base-Calling Algorithm for High-Throughput Sequencing [C]. Wei-Chun Kao, Yun S. Song. Research in Computational Molecular Biology. 2010

5. Statistical methods for genome variant calling and population genetic inference from next-generation sequencing data [D]. Ma, Xin. 2011

6. Probabilistic base calling of Solexa sequencing data [O]. Jacques Rougemont, Arnaud Amzallag, Christian Iseli, 2008

7. Probabilistic base calling of Solexa sequencing data [O]. Iseli Christian, Amzallag Arnaud, Rougemont Jacques, 2008
